A framework for improved video text detection and recognition
Internal identifier: 000139 (Main/Exploration); previous: 000138; next: 000140
Authors: Haojin Yang [Germany]; Bernhard Quehl [Germany]; Harald Sack [Germany]
Source:
- Multimedia tools and applications [ISSN 1380-7501]; 2014.
French descriptors
- Pascal (Inist):
- Signal vidéo, Reconnaissance caractère, Texte, Reconnaissance forme, Recherche information, Traitement image, Indexation, Vision ordinateur, Bibliothèque électronique, Vidéothèque, Collecticiel, Workflow, Sémantique, Processus métier, Rappel, Taux fausse alarme, Classification à vaste marge, Localisation
- Wicri:
- topic: Vidéothèque.
English descriptors
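- KwdEn:
- Business process, Character recognition, Computer vision, Electronic library, False alarm rate, Groupware, Image processing, Indexing, Information retrieval, Localization, Pattern recognition, Recall, Semantics, Text, Vector support machine, Video library, Video signal, Workflow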
Abstract
Text displayed in a video is an essential part of the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with a high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds and make it processable by standard OCR (Optical Character Recognition) software. Operability and accuracy of the proposed text detection and binarization methods have been evaluated by using publicly available test data sets.
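The abstract mentions refining candidate text lines with an image entropy-based filter but gives no details. The following is only a minimal sketch of what such a filter could look like, not the authors' method: it computes the Shannon entropy of the gray-level histogram of each candidate region and keeps regions at or above a threshold. The function names, the flat-list pixel representation, the threshold value, and the assumption that low-entropy regions are the ones to discard are all hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def image_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a flat sequence of gray values."""
    total = len(pixels)
    if total == 0:
        return 0.0
    counts = Counter(pixels)
    # H = sum over gray levels of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def filter_candidates(
    candidates: Sequence[Sequence[int]], min_entropy: float = 1.0
) -> List[Sequence[int]]:
    """Keep candidate regions whose entropy reaches the threshold.

    Assumption (not from the paper): uniform regions have low entropy
    and are unlikely to contain text, so they are dropped.
    """
    return [region for region in candidates if image_entropy(region) >= min_entropy]
```

For example, a uniform region such as `[0] * 8` has entropy 0 bits and would be dropped, while a region split evenly between two gray levels has entropy 1 bit and would be kept at the default threshold.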
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000006
- to stream PascalFrancis, to step Curation: 000759
- to stream PascalFrancis, to step Checkpoint: 000022
- to stream Main, to step Merge: 000140
- to stream Main, to step Curation: 000139
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A framework for improved video text detection and recognition</title>
<author><name sortKey="Haojin Yang" sort="Haojin Yang" uniqKey="Haojin Yang" last="Haojin Yang">HAOJIN YANG</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Quehl, Bernhard" sort="Quehl, Bernhard" uniqKey="Quehl B" first="Bernhard" last="Quehl">Bernhard Quehl</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Sack, Harald" sort="Sack, Harald" uniqKey="Sack H" first="Harald" last="Sack">Harald Sack</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">14-0217177</idno>
<date when="2014">2014</date>
<idno type="stanalyst">PASCAL 14-0217177 INIST</idno>
<idno type="RBID">Pascal:14-0217177</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000006</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000759</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000022</idno>
<idno type="wicri:doubleKey">1380-7501:2014:Haojin Yang:a:framework:for</idno>
<idno type="wicri:Area/Main/Merge">000140</idno>
<idno type="wicri:Area/Main/Curation">000139</idno>
<idno type="wicri:Area/Main/Exploration">000139</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A framework for improved video text detection and recognition</title>
<author><name sortKey="Haojin Yang" sort="Haojin Yang" uniqKey="Haojin Yang" last="Haojin Yang">HAOJIN YANG</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Quehl, Bernhard" sort="Quehl, Bernhard" uniqKey="Quehl B" first="Bernhard" last="Quehl">Bernhard Quehl</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Sack, Harald" sort="Sack, Harald" uniqKey="Sack H" first="Harald" last="Sack">Harald Sack</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
<wicri:noRegion>Prof.-Dr.-Helmert Str. 2-4</wicri:noRegion>
<wicri:noRegion>14467 Potsdam</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Multimedia tools and applications</title>
<title level="j" type="abbreviated">Multimed. tools appl.</title>
<idno type="ISSN">1380-7501</idno>
<imprint><date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Multimedia tools and applications</title>
<title level="j" type="abbreviated">Multimed. tools appl.</title>
<idno type="ISSN">1380-7501</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Business process</term>
<term>Character recognition</term>
<term>Computer vision</term>
<term>Electronic library</term>
<term>False alarm rate</term>
<term>Groupware</term>
<term>Image processing</term>
<term>Indexing</term>
<term>Information retrieval</term>
<term>Localization</term>
<term>Pattern recognition</term>
<term>Recall</term>
<term>Semantics</term>
<term>Text</term>
<term>Vector support machine</term>
<term>Video library</term>
<term>Video signal</term>
<term>Workflow</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Signal vidéo</term>
<term>Reconnaissance caractère</term>
<term>Texte</term>
<term>Reconnaissance forme</term>
<term>Recherche information</term>
<term>Traitement image</term>
<term>Indexation</term>
<term>Vision ordinateur</term>
<term>Bibliothèque électronique</term>
<term>Vidéothèque</term>
<term>Collecticiel</term>
<term>Workflow</term>
<term>Sémantique</term>
<term>Processus métier</term>
<term>Rappel</term>
<term>Taux fausse alarme</term>
<term>Classification à vaste marge</term>
<term>Localisation</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Vidéothèque</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Text displayed in a video is an essential part of the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with a high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds and make it processable by standard OCR (Optical Character Recognition) software. Operability and accuracy of the proposed text detection and binarization methods have been evaluated by using publicly available test data sets.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
</list>
<tree><country name="Allemagne"><noRegion><name sortKey="Haojin Yang" sort="Haojin Yang" uniqKey="Haojin Yang" last="Haojin Yang">HAOJIN YANG</name>
</noRegion>
<name sortKey="Quehl, Bernhard" sort="Quehl, Bernhard" uniqKey="Quehl B" first="Bernhard" last="Quehl">Bernhard Quehl</name>
<name sortKey="Sack, Harald" sort="Sack, Harald" uniqKey="Sack H" first="Harald" last="Sack">Harald Sack</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000139 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000139 | SxmlIndent | more
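Outside Dilib, the descriptors in the XML record can also be pulled out with a short script. The sketch below is an illustration only, not part of the Dilib toolchain. It assumes the record text is already available as a string, and it uses regular expressions rather than an XML parser because the record fragment uses prefixes such as `wicri:` and `inist:` without declaring them, which a strict parser may reject. The function name `extract_keywords` and the choice to skip terms without any letter or digit are hypothetical.

```python
import re
from typing import Dict, List

# Each <keywords ...> group with its attributes and body.
KEYWORDS_RE = re.compile(r"<keywords\b([^>]*)>(.*?)</keywords>", re.DOTALL)
# The scheme attribute inside the opening tag (e.g. KwdEn, Pascal, Wicri).
SCHEME_RE = re.compile(r'scheme="([^"]*)"')
# Individual <term> elements inside a group.
TERM_RE = re.compile(r"<term>(.*?)</term>", re.DOTALL)


def extract_keywords(record_xml: str) -> Dict[str, List[str]]:
    """Map each keyword scheme to the list of terms found under it."""
    result: Dict[str, List[str]] = {}
    for attrs, body in KEYWORDS_RE.findall(record_xml):
        match = SCHEME_RE.search(attrs)
        scheme = match.group(1) if match else ""
        for term in TERM_RE.findall(body):
            term = term.strip()
            # Skip empty or punctuation-only terms.
            if any(ch.isalnum() for ch in term):
                result.setdefault(scheme, []).append(term)
    return result
```

Calling `extract_keywords` on the record text returns a dictionary keyed by scheme name, so the English descriptors would be available under the `KwdEn` key.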
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:14-0217177 |texte= A framework for improved video text detection and recognition }}
This area was generated with Dilib version V0.6.32.